Statistical resolution of scope ambiguity in natural language

Author

  • Galen Andrew
Abstract

A crucial obstacle is that there is no labeled corpus available that is suitable for learning to resolve scope ambiguities. As a consequence, we’ve had to generate our own data set by (a) selecting sentences appropriate to our purposes, and (b) hand-labeling the sentences selected. We recognize that this represents a methodological compromise: our results would carry more weight if the training data, and especially the test data, had not been generated by the researchers themselves. However, we’ve attempted to mitigate this shortcoming by establishing in advance clear guidelines for selecting and labeling sentences. We obtained our data from sentences drawn from GRE and LSAT logic games. We chose this limited domain because these sentences are specifically designed to have an interpretation that is unambiguous to a human reader. Thus there is less subjectivity about the correct labelings, and determining the correct scoping is less likely to depend on context and pragmatics.

Related resources

Semantic Priming Effect on Relative Clause Attachment Ambiguity Resolution in L2

This study examined whether processing ambiguous sentences containing relative clauses (RCs) following a complex determiner phrase (DP) by Persian-speaking learners of L2 English with different proficiency and working memory capacities (WMCs) is affected by semantic priming. The semantic relationship studied was one between the subject/verb of the main clause and one of the DPs in the complex D...

Relative Clause Ambiguity Resolution in L1 and L2: Are Processing Strategies Transferred?

This study aims at investigating whether Persian native speakers highly advanced in English as a second language (L2ers) can switch to optimal processing strategies in the languages they know and whether working memory capacity (WMC) plays a role in this respect. To this end, using a self-paced reading task, we examined the processing strategies 62 Persian speaking proficient L2ers used to read...

Lexical Ambiguity Resolution for Turkish in Direct Transfer Machine Translation Models

This paper presents a statistical lexical ambiguity resolution method in direct transfer machine translation models in which the target language is Turkish. Since direct transfer MT models do not have full syntactic information, most of the lexical ambiguity resolution methods are not very helpful. Our disambiguation model is based on statistical language models. We have investigated the perfor...

Computational Complexity of Probabilistic Disambiguation

Recent models of natural language processing employ statistical reasoning for dealing with the ambiguity of formal grammars. In this approach, statistics, concerning the various linguistic phenomena of interest, are gathered from actual linguistic data and used to estimate the probabilities of the various entities that are generated by a given grammar, e.g., derivations, parse-trees and sentenc...

Constraint-based and graph-based resolution of ambiguities in natural language

This thesis develops the theory of dominance constraints, a family of logical languages that describe trees, and applies them as a formalism for the underspecified description of scope ambiguities in natural language. In underspecification approaches to ambiguity resolution, all readings of a sentence at once are represented in an underspecified description, and are only enumerated by need. On ...

Publication date: 2004